DATE: 02/20/98
INPUT SET: S23591.raw



This Raw Listing contains the General 
Information Section and up to the first 5 pages. 



1 SEQUENCE LISTING 
2 

3 (1) General Information: 
4 

5 (i) APPLICANT: Tripp, Cynthia Ann 

6 Frank, Glenn R. 

7 Grieve, Robert B. 
8 

9 (ii) TITLE OF INVENTION: NOVEL PARASITE ASTACIN 

10 METALLOENDOPEPTIDASE PROTEINS 
11 

12 (iii) NUMBER OF SEQUENCES: 36 
13 

14 (iv) CORRESPONDENCE ADDRESS: 

15 (A) ADDRESSEE: SHERIDAN ROSS P.C. 

16 (B) STREET: 1700 LINCOLN ST., SUITE 3500 

17 (C) CITY: DENVER 

18 (D) STATE: CO 

19 (E) COUNTRY: USA 

20 (F) ZIP: 80203 
21 

22 (v) COMPUTER READABLE FORM:

23 (A) MEDIUM TYPE: Floppy disk 

24 (B) COMPUTER: IBM PC compatible 

25 (C) OPERATING SYSTEM: PC-DOS/MS-DOS 

26 (D) SOFTWARE: PatentIn Release #1.0, Version #1.30
27 

28 (vi) CURRENT APPLICATION DATA:

29 (A) APPLICATION NUMBER:

30 (B) FILING DATE: 

31 (C) CLASSIFICATION: 
32 

33 (viii) ATTORNEY/AGENT INFORMATION:

34 (A) NAME: Connell, Gary J. 

35 (B) REGISTRATION NUMBER: 32,020 

36 (C) REFERENCE/DOCKET NUMBER: 2618-21-1-C1 
37 

38 (ix) TELECOMMUNICATION INFORMATION: 

39 (A) TELEPHONE: (303) 863-9700 

40 (B) TELEFAX: (303) 863-0223 
41 

42 

43 (2) INFORMATION FOR SEQ ID NO:1:
44 

45 (i) SEQUENCE CHARACTERISTICS: 

46 (A) LENGTH: 1299 base pairs 



PAGE: 2 RAW SEQUENCE LISTING DATE: 02/20/98 

PATENT APPLICATION US/09/003,574 TIME: 09:40:42 

INPUT SET: S23591.raw

47 (B) TYPE: nucleic acid 

48 (C) STRANDEDNESS: single

49 (D) TOPOLOGY: linear
50 

51 (ii) MOLECULE TYPE: cDNA 

52 

53 

54 

55 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

56 

57 TTTTTTTTTT TTTTTTTTGT TTCATTGTTC AGTCAGTGGA AAATTATCGA ACGCAGAAAG 60 
58 

59 CATCACGAAA TACGTTAGAT CACATCAAAC AACTTATCAC CTTGAACGTA CAAAGAGAGA 120
60 

61 TTGGAAACAT AGATGATAAG ACATTAGCTG ATGAAATAGT ATTACAACGA CGGGATCCTG 180 
62 

63 AGGCAAAATG GCATCATAAT GAACTATTCA TTAATGATCC AGATGCATAC TATCAAGGCG 240
64 

65 ATGTCGATTT GTCGGAAAAA CAAGCCGAAA TTCTAAGCGA ACATTTTAAA AATGAAATTG 300
66 

67 CTTTAACAGA GAAAGACGAC ACAATAATAC GGCGAAAAAA GAGCATTGGT CGTGAACCAT 360
68 

69 TTTACGTAAG ATGGAATCAT AAACGTCCCA TTAGCTATGA ATTTGCGGAA AGTATTCCAT 420
70 

71 TAGAAACACG TAGAAAAATT CGTTCAGCAA TAGCAATGTG GGAAGAACGA ACATGCATAC 480 
72 

73 GATTCCAAGA AAATGGCCCA AATGTAGATC GAATTGAATT TTACGACGGT GGCGGTTGTT 540
74 

75 CAAGTTTTGT CGGCCGAACA GGAGGGAATT TCAATTTCAA CACCAGGATG TGATATTATT 600 
76 

77 GGTATTATAT CACATGAAAT TGGTCATACT TTAGGAATAT TTCATGAGCA AGCACGTCGT 660 
78 

79 GATCAAAAAA ATCATATTTT TATTAATTAC AACAATATTC CATCAAGCCG TTGGAACAAT 720
80 

81 TTTTTTCCAT TATCAGAATA TGAAGCTGAT ATGTTTAATT TACCTTATGA TACAGGATCA 780 
82 

83 GTAATGCACT ATGGTTCATA CGGATTTGCA AGAAATCCGT ATGAACCAAC TATTACAACA 840
84 

85 CGTGATAAAT TTCAACAGTA CACAATTGGG CAACGTGAAG GGCCATCATT TCTGGATTAT 900 
86 

87 GCATCTGTTA AGCTTTATCT ACAAACGCAT TAATGATATT GTTATCAAAT GGATGATAAT 960 
88 

89 TTCAATAAGT ATAAACAGCG CTTATCGTTG TACAGAACAA TGTGCTGATA TGCACTGCGA 1020 
90 

91 TCATAATGGT TATCCGGATC CTAATAATTG CGCGAAATGC TTGTGTCCAG ATGGTTTTGC 1080 
92 

93 TGGTCGTACC TGTCAATTTG TTCAATATAC ATCTTGCGGA GCTCTCATTA AGGTAAGTAT 1140 
94 

95 TGTCTTTTGA CCTCTTCTCT GACTAAAATA TAAGTTAAGC ATATGTATCT TCCGTCTAAT 1200 
96 

97 GATTTTCTTG ATTTTGATTT GTTCAATGCT CTTCTTGATA ATAATATAAA AATTTTTGAA 1260 
98 

99 AATAAAGTTA ACTTTTGGTC AAAAAAAAAA AAAAAAAAA 1299 
100

101 (2) INFORMATION FOR SEQ ID NO: 2: 
102 

103 (i) SEQUENCE CHARACTERISTICS: 

104 (A) LENGTH: 2126 base pairs 

105 (B) TYPE: nucleic acid 

106 (C) STRANDEDNESS: single

107 (D) TOPOLOGY: linear 
108 

109 (ii) MOLECULE TYPE: cDNA 

110 

111 

112 

113 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

114 

115 GAAAGCATCA CGAAATACGT TAGATCACAT CAAACAACTT ATCACCTTGA ACGTACAAAG 60 
116 

117 AGAGATTGGA AACATAGATG ATAAGACATT AGCTGATGAA ATAGTATTAC AACGACGGGA 120 
118 

119 TCCTGAGGCA AAATGGCATC ATAATGAACT ATTCATTAAT GATCCAGATG CATACTATCA 180 
120 

121 AGGCGATGTC GATTTGTCGG AAAAACAAGC CGAAATTCTA AGCGAACATT TTAAAAATGA 240 
122 

123 AATTGCTTTA ACAGAGAAAG ACGACACAAT AATACGGCGA AAAAAGAGCA TTGGTCGTGA 300 
124 

125 ACCATTTTAC GTAAGATGGA ATCATAAACG TCCCATTAGC TATGAATTTG CGGAAAGTAT 360 
126 

127 TCCATTAGAA ACACGTAGAA AAATTCGTTC AGCAATAGCA ATGTGGGAAG AACGAACATG 420

128 

129 CATACGATTC CAAGAAAATG GCCCAAATGT AGATCGAATT GAATTTTACG ACGGTGGCGG 480 
130 

131 TTGTTCAAGT TTTGTCGGCC GAACAGGAGG GAATTTCAAT TTCAACACCA GGATGTGATA 540 
132 

133 TTATTGGTAT TATATCACAT GAAATTGGTC ATACTTTAGG AATATTTCAT GAGCAAGCAC 600 
134 

135 GTCGTGATCA AAAAAATCAT ATTTTTATTA ATTACAACAA TATTCCATCA AGCCGTTGGA 660

136 

137 ACAATTTTTT TCCATTATCA GAATATGAAG CTGATATGTT TAATTTACCT TATGATACAG 720

138 

139 GATCAGTAAT GCACTATGGT TCATACGGAT TTGCAAGAAA TCCGTATGAA CCAACTATTA 780 
140 

141 CAACACGTGA TAAATTTCAA CAGTACACAA TTGGGCAACG TGAAGGGCCA TCATTTCTGG 840 
142 

143 ATTATGCATC TGATAAACAG CGCTTATCGT TGTACAGAAC AATGTGCTGA TATGCACTGC 900
144 

145 GATCATAATG GTTATCCGGA TCCTAATAAT TGCGCGAAAT GCTTGTGTCC AGATGGTTTT 960 
146 

147 GCTGGTCGTA CCTGTCAATT TGTTCAATAT ACATCTTGCG GAGCTCTCAT TAAGGCGAGG 1020 
148 

149 AAAATGCCTG TTACGATTTC GAGCCCAAAT TATCCAAACT TCTTCAATGT TGGTGATCAA 1080
150 

151 TGTATTTGGT TGCTTACAGC TCCACGCGTG ATTCGTAAAT TTGCAGTTTG TTGAACAATT 1140 
152 
153 TCAATTACAA TGTGAAGATA CGTGTGATAA ATCCTATGTA GAAGTGAAAG CTGACGCTGA 1200
154 

155 TTTTCGACCT ACTGGATATC GATTTTGTTG TTCGCGAGTG CCACGTCATA TTTTTCAATC 1260 
156 

157 TGCGACAAAC GAGATGGTAG TAATATTTCG CGGTTTTGGT GATGCGGGAA ATGGCTTTAA 1320 
158 

159 AGCTAAAATT TGGTCAAACG TAGATGATGA TATAGCTAAT ACAATTGTAA CAACTGAAAT 1380
160 

161 GGCAAAAATT TCGGAAAAAA TACCGAAGCT AACAGTTCCA ATAGTTAAAA CTATTACCAC 1440 
162 

163 TCCTACAATA ACAACTACTA CTGCTTTCAT GATATCACCC AAGAAAGGCA ATGTCACCGC 1500
164 

165 CACGAGAGTT GCTATCACTA CTACGCCGAC TACTACAATT ACTACGACTA TTGCCGGTAC 1560 
166 

167 GTACCAATCA CCGTAACTAA TAATACTACA CCTGTAGTAA GTGAAACTTT ACCATCATTG 1620 
168 

169 CCAGTCAAGA TTCGAAACAA AATAGGTGCA TGCGAATGTG GTGAATGGAC AGAATGGACA 1680
170 

171 GGTCCATGCT CTCAAGAATG TGGCGGTTGC GGAAAACGTC TTCGAACACG TCAGTGTTCA 1740 
172 

173 TCAGATACGG AATGTAGAAC AGAAGAAAAA CGTGCGTGTG CTTTTAAGTT TGCCCATACG 1800
174 

175 GGACTAATTT CCTTATCAAT AATGGAGAGT TTCATATACT TTGGAAGGGC TGCTGTGTTG 1860 
176 

177 GTCTATTCCG ATCGGGAGAT ATGTGTTCAG CACTTGATGA TAACGAGAAT CCATTTCTGA 1920 
178 

179 AATTTCTAGA ATCACTGTTG AACATGCAAG ATTCTCGAAA AAACGATAAT TTGCCTGACT 1980
180 

181 CGAAAAAGAA GTGATTGAAT GATTCGATAA TATTGATTAA TAAAACGGGT TGTATTCTCG 2040 
182 

183 TCATAGAGTA TCCGTTGATG TTTTTATCCA AAAAATTCTC TTGCTTTTAA TTATTGTGAA 2100
184 

185 TAAAACTTTT GTTTACCCAA AAAAAA 2126 
186 

187 (2) INFORMATION FOR SEQ ID NO: 3: 
188 

189 (i) SEQUENCE CHARACTERISTICS: 

190 (A) LENGTH: 191 amino acids 

191 (B) TYPE: amino acid 

192 (C) STRANDEDNESS: 

193 (D) TOPOLOGY: linear 
194 

195 (ii) MOLECULE TYPE: protein 

196 

197 

198 

199 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

200 

201 Cys Phe Ile Val Gln Ser Val Glu Asn Tyr Arg Thr Gln Lys Ala Ser

202 1 5 10 15
203

204 Arg Asn Thr Leu Asp His Ile Lys Gln Leu Ile Thr Leu Asn Val Gln

205 20 25 30
206

207 Arg Glu Ile Gly Asn Ile Asp Asp Lys Thr Leu Ala Asp Glu Ile Val

208 35 40 45
209

210 Leu Gln Arg Arg Asp Pro Glu Ala Lys Trp His His Asn Glu Leu Phe

211 50 55 60
212

213 Ile Asn Asp Pro Asp Ala Tyr Tyr Gln Gly Asp Val Asp Leu Ser Glu

214 65 70 75 80
215

216 Lys Gln Ala Glu Ile Leu Ser Glu His Phe Lys Asn Glu Ile Ala Leu

217 85 90 95
218

219 Thr Glu Lys Asp Asp Thr Ile Ile Arg Arg Lys Lys Ser Ile Gly Arg

220 100 105 110
221

222 Glu Pro Phe Tyr Val Arg Trp Asn His Lys Arg Pro Ile Ser Tyr Glu

223 115 120 125
224

225 Phe Ala Glu Ser Ile Pro Leu Glu Thr Arg Arg Lys Ile Arg Ser Ala

226 130 135 140
227

228 Ile Ala Met Trp Glu Glu Arg Thr Cys Ile Arg Phe Gln Glu Asn Gly

229 145 150 155 160
230

231 Pro Asn Val Asp Arg Ile Glu Phe Tyr Asp Gly Gly Gly Cys Ser Ser

232 165 170 175
233

234 Phe Val Gly Arg Thr Gly Gly Asn Phe Asn Phe Asn Thr Arg Met

235 180 185 190 

236 

237 (2) INFORMATION FOR SEQ ID NO: 4:
238

239 (i) SEQUENCE CHARACTERISTICS:

240 (A) LENGTH: 141 amino acids

241 (B) TYPE: amino acid

242 (C) STRANDEDNESS:

243 (D) TOPOLOGY: linear

244 

245 (ii) MOLECULE TYPE: protein 

246 

247 

248 

249 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

250

251 Ile Glu Leu Asn Phe Thr Thr Val Ala Val Val Gln Val Leu Ser Ala

252 1 5 10 15
253

254 Glu Gln Glu Gly Ile Ser Ile Ser Thr Pro Gly Cys Asp Ile Ile Gly

255 20 25 30
256

257 Ile Ile Ser His Glu Ile Gly His Thr Leu Gly Ile Phe His Glu Gln

258 35 40 45 